84 research outputs found

Association Rules Mining Based Clinical Observations

Author: Hoque Md Tamjidul
Rashid Mahmood A.
Sattar Abdul
Publication date: 01/01/2014

Healthcare institutes continually enrich their repositories of patients' disease-related information, which could be made more useful through relational analysis. Data mining algorithms have proven quite useful for uncovering correlations in large data repositories. In this paper we implement a novel idea, based on association rules mining, for finding co-occurrences of diseases carried by a patient using the healthcare repository. We have developed a system prototype for Clinical State Correlation Prediction (CSCP) which extracts data from the patients' healthcare database and transforms the OLTP data into a Data Warehouse by generating association rules. The CSCP system helps reveal relations among diseases. It predicts the correlation(s) between the primary disease (the disease for which the patient visits the doctor) and secondary disease(s) (other associated diseases carried by the same patient). Comment: 5 pages, MEDINFO 2010, C. Safran et al. (Eds.), IOS Press
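The co-occurrence analysis described in this abstract can be illustrated with the two standard association-rule measures, support and confidence. What follows is a minimal sketch, not the CSCP implementation: the function name `rule_metrics`, the disease names, and the toy patient records are all hypothetical.

```python
from typing import FrozenSet, List, Tuple


def rule_metrics(
    records: List[FrozenSet[str]], primary: str, secondary: str
) -> Tuple[float, float]:
    """Return (support, confidence) for the rule primary -> secondary.

    support    = fraction of all records containing both diseases
    confidence = fraction of records with the primary disease that
                 also contain the secondary disease
    """
    n = len(records)
    with_primary = sum(1 for r in records if primary in r)
    with_both = sum(1 for r in records if primary in r and secondary in r)
    support = with_both / n if n else 0.0
    confidence = with_both / with_primary if with_primary else 0.0
    return support, confidence


# Hypothetical toy data: each record is the set of diseases of one patient.
records = [
    frozenset({"diabetes", "hypertension"}),
    frozenset({"diabetes", "hypertension", "neuropathy"}),
    frozenset({"diabetes"}),
    frozenset({"asthma"}),
]

support, confidence = rule_metrics(records, "diabetes", "hypertension")
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

A rule is usually kept only when both measures exceed chosen thresholds; the abstract does not state which thresholds CSCP uses.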

arXiv.org e-Print Archive

Victoria University Eprints Repository

DisPredict: A Predictor of Disordered Protein Using Optimized RBF Kernel

Author: Hoque Md Tamjidul
Iqbal Sumaiya
Publication venue: ScholarWorks@UNO
Publication date: 01/01/2015

Intrinsically disordered proteins, or regions, perform important biological functions through their dynamic conformations during binding. Accurate identification of these disordered regions therefore has significant implications for proper annotation of function, induced fold prediction and drug design to combat critical diseases. We introduce DisPredict, a disorder predictor that employs a single support vector machine with an RBF kernel and novel features for reliable characterization of protein structure. DisPredict yields effective performance. In addition to 10-fold cross validation, DisPredict was trained and tested with independent test datasets. The results were consistent, with both the training and test errors minimal. The use of multiple data sources makes the predictor generic. The datasets used in developing the model include disordered regions of various lengths, categorized as short and long, with different compositions and different types of disorder, ranging from fully to partially disordered regions, as well as completely ordered regions. Through comparison with other state-of-the-art approaches and case studies, DisPredict is found to be a useful tool with competitive performance. DisPredict is available at https://github.com/tamjidul/DisPredict_v1.0
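For readers unfamiliar with the RBF kernel mentioned in this abstract, the sketch below shows the standard kernel function, K(x, y) = exp(-gamma * ||x - y||^2). It is illustrative only: the function name `rbf_kernel`, the feature vectors, and the gamma values are assumptions, and DisPredict's actual features and optimized parameters are not given in the abstract.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float) -> float:
    """Standard RBF (Gaussian) kernel between two feature vectors."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    # Squared Euclidean distance between the two vectors.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Hypothetical feature vectors for two residues.
same = rbf_kernel([0.2, 0.7], [0.2, 0.7], gamma=0.5)  # identical vectors
near = rbf_kernel([0.2, 0.7], [0.3, 0.6], gamma=0.5)
far = rbf_kernel([0.2, 0.7], [2.0, -1.0], gamma=0.5)
print(same, near, far)
```

The kernel equals 1 for identical vectors and decays towards 0 as they move apart; gamma controls how quickly, which is why it is typically tuned together with the SVM cost parameter.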

Directory of Open Access Journals

PubMed Central

University of New Orleans

PCaAnalyser: A 2D-Image Analysis Based Module for Effective Determination of Prostate Cancer Progression in 3D Culture

Author: Avery Vicky M.
Hoque Md Tamjidul
Lovitt Carrie J.
Windus Louisa C. E.
Publication venue: ScholarWorks@UNO
Publication date: 01/11/2013

Three-dimensional (3D) in vitro cell-based assays for Prostate Cancer (PCa) research are rapidly becoming the preferred alternative to conventional 2D monolayer cultures. 3D assays more precisely mimic the microenvironment found in vivo, and thus are ideally suited to evaluating compounds and their suitability for progression in the drug discovery pipeline. To achieve the high throughput needed for most screening programs, automated quantification of 3D cultures is required. Towards this end, this paper reports on the development of a prototype analysis module for an automated high-content-analysis (HCA) system, which allows accurate and fast investigation of in vitro 3D cell culture models for PCa. The Java-based program, which we have named PCaAnalyser, uses novel algorithms that allow accurate and rapid quantitation of protein expression in 3D cell culture. As currently configured, PCaAnalyser can quantify a range of biological parameters, including nuclei count, nuclei-spheroid membership prediction, and various function-based classifications of peripheral and non-peripheral areas to measure expression of biomarkers and protein constituents known to be associated with PCa progression, and can segregate cellular objects effectively across a range of signal-to-noise ratios. In addition, the PCaAnalyser architecture is highly flexible, operating as a single independent analysis as well as in batch mode, which is essential for High-Throughput-Screening (HTS). Using PCaAnalyser, we provide accurate and rapid analysis in an automated high-throughput manner, and demonstrate reproducible analysis of the distribution and intensity of well-established markers associated with PCa progression in a range of metastatic PCa cell lines (DU145 and PC3) in a 3D model.

University of New Orleans

Spiral search: a hydrophobic-core directed local search for simplified PSP on 3D FCC lattice

Author: AA Tantar
Abdul Sattar
Adam Smith
AL Patton
B Berger
C Blum
C Levinthal
C Rohl
C Thachuk
CB Anfinsen
CM Dobso
Duc Nghia Pham
F Glover
GW Klau
HJ Böckenhauer
I Dotu
J Lee
K Yue
KA Dill
KF Lau
M Cebrián
MA Hakim Newton
MA Rashid
Mahmood A Rashid
Md Tamjidul Hoque
MT Hoque
N Lesh
R Bonneau
R Unger
S Shatabda
Swakkhar Shatabda
T Hales
The Science Editorial
V Cutello
Y Xia
Publication venue: Springer Science and Business Media LLC

Crossref

Critical assessment of protein intrinsic disorder prediction

Author: Aykac-Fas Burcu
Bassot Claudio
Benítez Guillermo Ignacio
Bevilacqua Martina
Bitard-Feildel Tristan
CAID Predictors
Callebaut Isabelle
Chasapi Anastasia
Chemes Lucia Beatriz
Cheng Jianlin
Cozzetto Domenico
Davey Norman
Davidović Radoslav
DisProt Curators
Dosztányi Zsuzsanna
Dunker A. Keith
Elofsson Arne
Erdős Gábor
Galzitskaya Oxana Valerianovna
Gao Jianzhao
González-Foutel Nicolás S.
Govindarajan Sudha
Gsponer Jörg
Guharoy Mainak
Hajdu-Soltész Borbála
Hanson Jack
Hatos András
Hoque Md Tamjidul
Horvath Tamas
Hu Gang
Iglesias Valentin
Iqbal Sumaiya
Jones David T.
Kajava Andrey V.
Kovacs Orsolya Panna
Kurgan Lukasz
Lamb John
Lambrughi Matteo
Lazar Tamas
Leclercq Jeremy Y.
Leonardi Emanuela
Litfin Thomas
Lobanov Michail Yu
Macedo-Ribeiro Sandra
Macossay-Castillo Mauricio
Maiani Emiliano
Malhis Nawar
Manso Jose Antonio
Marino-Buslje Cristina
Martínez-Pérez Elizabeth
Meng Fanchi
Minervini Giovanni
Mirabello Claudio
Mičetić Ivan
Monzon Alexander Miguel
Murvai Nikoletta
Mészáros Bálint
Necci Marco
Orlando Gabriele
Ouzounis Christos
Pajkos Mátyás
Paladin Lisanna
Paliwal Kuldip
Palopoli Nicolás
Pancsa Rita
Papaleo Elena
Parisi Gustavo
Peng Zhenling
Pereira Pedro José Barbosa
Piovesan Damiano
Promponas Vasilis J.
Pujols Jordi
Quaglia Federica
Raimondi Daniele
Salvatore Marco
Schad Eva
Sharma Alok
Sharma Ronesh
Sormanni Pietro
Szabo Beata
Szaniszló Tamás
Tamana Stella
Tantos Agnes
Tompa Peter
Tosatto Silvio C. E.
Veljkovic Nevena
Vendruscolo Michele
Ventura Salvador
Vranken Wim
Wallner Björn
Walsh Ian
Wang Chen
Wang Kui
Wang Sheng
Wu Tianqi
Wu Zhonghua
Xu Jinbo
Yan Jing
Zhou Yaoqi
Álvarez Lucía
Publication venue: Nature Methods
Publication date: 01/01/2021

Abstract: Intrinsically disordered proteins, defying the traditional protein structure–function paradigm, are a challenge to study experimentally. Because a large part of our knowledge rests on computational predictions, it is crucial that their accuracy is high. The Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment was established as a community-based blind test to determine the state of the art in prediction of intrinsically disordered regions and the subset of residues involved in binding. A total of 43 methods were evaluated on a dataset of 646 proteins from DisProt. The best methods use deep learning techniques and notably outperform physicochemical methods. The top disorder predictor has Fmax = 0.483 on the full dataset and Fmax = 0.792 following filtering out of bona fide structured regions. Disordered binding regions remain hard to predict, with Fmax = 0.231. Interestingly, computing times among methods can vary by up to four orders of magnitude.
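The Fmax values quoted in this abstract denote the maximum F1-score obtained over all decision thresholds. The sketch below shows that calculation on per-residue scores; it is a minimal illustration under stated assumptions, not the CAID evaluation code. The function name `f_max` and the toy scores and labels are hypothetical, and the abstract does not say how CAID pools residues or selects thresholds.

```python
from typing import Sequence


def f_max(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Maximum F1-score over thresholds taken from the observed scores.

    A residue is predicted disordered (positive) when score >= threshold.
    labels use 1 for disordered and 0 for ordered residues.
    """
    best = 0.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        if tp == 0:
            continue  # precision and recall are both zero or undefined here
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        best = max(best, 2 * precision * recall / (precision + recall))
    return best


# Hypothetical predictor output for four residues.
scores = [0.9, 0.8, 0.4, 0.2]
labels = [1, 0, 1, 0]
print(round(f_max(scores, labels), 3))
```

Because Fmax picks the best threshold after the fact, it measures how well the scores rank residues rather than how well a predictor's default cutoff is calibrated.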

CONICET Digital

HAL-IRD

Diposit Digital de Documents de la UAB

Apollo (Cambridge)

Genetic algorithm for Ab initio protein structure prediction based on low resolution models

Author: Hoque Md Tamjidul
Publication venue: Monash University. Faculty of Information Technology. Gippsland School of Information Technology

A protein is a sequence of amino acids bonded into a linear chain that adopts a specific folded three-dimensional (3D) shape. This specific folded shape enables the protein to perform specific tasks. Amongst the various available computational methods, protein structure prediction by the ab initio approach is promising and can help to unravel the relationship between a sequence and its associated structure. This thesis focuses on ab initio protein structure prediction (PSP), developing a novel Genetic Algorithm (GA) for an efficient and effective conformation search of low resolution models derived from the two-bead hydrophobic-hydrophilic (HP) model. The thesis also proposes a novel low resolution model, called the hHPNX model, which provides more accurate predictions than the existing low resolution HP models. As a search technique, GA shows promise in the complex search landscape of the PSP problem. However, for longer sequences the performance of GA can deteriorate and cause the algorithm to frequently stall or become stuck in local minima. Therefore, in this thesis, a critical analysis of the working principle of GA (i.e., the schemata theorem) is presented. This analysis leads to the generalisation of the schemata theorem. The fallacies in the selection procedure of the schemata theorem are removed and its crossover operation is fully defined. A novel concept, a chromosome correlation factor (CCF), is proposed to identify similar chromosomes within the GA population, and the optimal value of CCF enables GA to perform effectively and thus helps provide superior results. Further, a non-isomorphic encoding algorithm is proposed for a bijective encoding within GA that prevents the expansion of the search landscape by maintaining a 1:1 relationship between the genotype and the phenotype. The non-isomorphic encoding reduces the chances of GA stalling and also prevents the tendency of the normal stochastic GA search to behave like a random search.
Since the PSP solutions are compact in nature, the simple GA developed without any heuristics is further improved into a hybrid GA (HGA) by utilising domain-specific knowledge. For an optimal core cavity, we have defined likely sub-conformations to provide guided search. Further, the multi-objective formulation of the search problem can overcome possible stall or stuck conditions by backtracking effectively and performing efficiently. Novel and effective move operators are designed and applied to efficiently move part of the converging compact conformation and thus achieve overall superior results. The simplified HP model and its extension, the HPNX model, are effective in exploring the convoluted PSP search landscape quickly. With its simplicity maintained, the HPNX model is extended to a novel model called the hHPNX model, which reduces the amount of degeneracy and additionally captures the characteristics of two distinguished amino acids (Alanine and Valine) from the hydrophobic group. A corrected interaction potential matrix for an existing YhHX model is proposed, leading to its correct representation. Further, the face-centred-cube (FCC) model is shown to have the optimal lattice configuration for closely mapping the real folded protein. Three novel techniques are developed to compute the fitness function efficiently and so reduce the computation time. Most importantly, the improvement in computation speed is achieved without sacrificing the accuracy of the prediction. All the techniques are complementary to each other and can work concurrently, thereby reducing the computation time significantly.
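To make the HP model and FCC lattice mentioned in this abstract concrete, the sketch below evaluates the basic HP contact energy of a conformation on an FCC lattice. It is an illustration under stated assumptions, not the thesis's fitness function: it uses the common convention of -1 per contact between non-consecutive H residues, represents FCC points as integer triples whose neighbours lie at squared distance 2, and uses the hypothetical names `is_fcc_neighbour` and `hp_energy`. The hHPNX potentials and the three fast fitness techniques are not reproduced.

```python
from itertools import combinations
from typing import List, Tuple

Point = Tuple[int, int, int]


def is_fcc_neighbour(a: Point, b: Point) -> bool:
    """FCC neighbours differ by a permutation of (+-1, +-1, 0)."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) == 2


def hp_energy(sequence: str, coords: List[Point]) -> int:
    """HP contact energy: -1 per H-H contact between non-consecutive residues."""
    if len(sequence) != len(coords):
        raise ValueError("one coordinate is required per residue")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for i in range(len(coords) - 1):
        if not is_fcc_neighbour(coords[i], coords[i + 1]):
            raise ValueError("consecutive residues must be lattice neighbours")
    energy = 0
    for i, j in combinations(range(len(sequence)), 2):
        if (
            j - i > 1
            and sequence[i] == "H"
            and sequence[j] == "H"
            and is_fcc_neighbour(coords[i], coords[j])
        ):
            energy -= 1
    return energy


# Hypothetical three-residue chain folded so that residues 0 and 2 touch.
coords = [(0, 0, 0), (1, 1, 0), (1, 0, 1)]
print(hp_energy("HHH", coords))  # residues 0 and 2 form one H-H contact
print(hp_energy("HHP", coords))  # no non-consecutive H-H pair
```

A GA of the kind described would treat this energy, or a refinement of it, as the fitness to minimise while searching over self-avoiding walks on the lattice.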

Estimation of Position Specific Energy as a Feature of Protein Residues from Sequence Alone for Structural Classification.

Author: Md Tamjidul Hoque
Sumaiya Iqbal
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2016

A set of features computed from the primary amino acid sequence of proteins is crucial in the process of inducing a machine learning model that is capable of accurately predicting three-dimensional protein structures. Solutions for existing protein structure prediction problems are in need of features that can capture the complexity of molecular-level interactions. With a view to this, we propose a novel approach to estimate the position specific estimated energy (PSEE) of a residue using contact energy and predicted relative solvent accessibility (RSA). Furthermore, we demonstrate that PSEE can be reasonably estimated based on sequence information alone. PSEE is useful in identifying the structured as well as the unstructured, or intrinsically disordered, regions of a protein by computing favorable and unfavorable energy respectively, characterized by an appropriate threshold. The most intriguing finding, verified empirically, is the indication that the PSEE feature can effectively classify disordered versus ordered residues and can segregate residues of different secondary structure types by computing the constituent energies. PSEE values for each amino acid strongly correlate with the hydrophobicity value of the corresponding amino acid. Further, PSEE can be used to detect the existence of critical binding regions that essentially undergo disorder-to-order transitions to perform crucial biological functions. Towards an application of disorder prediction using the PSEE feature, we have rigorously tested and found that a support vector machine model informed by a set of features including PSEE consistently outperforms a model with an identical set of features with PSEE removed. In addition, the new disorder predictor, DisPredict2, shows competitive performance in predicting protein disorder when compared with six existing disordered protein predictors.

Directory of Open Access Journals

PubMed Central

Genetic algorithm for Ab initio protein structure prediction based on low resolution models

Author: Hoque Md Tamjidul

FigShare

Performance of ordered and disordered residue classification based on per residue PSEE value calculated using different contact radius (CR) values.

Author: Md Tamjidul Hoque
Sumaiya Iqbal

Classification performance is shown in terms of (A) ACC (blue bar), (B) PPV (purple bar) and (C) MCC (green bar) for CR values from 4 to 30. The x-axis and y-axis show the CR values and the performance metric values, respectively.
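The three metrics named in this caption can all be derived from a binary confusion matrix. The sketch below uses the standard formulas for PPV (precision) and MCC (Matthews correlation coefficient). ACC is computed here as plain accuracy, which is an assumption: the caption does not define it, and some disorder-prediction studies use a balanced variant instead. The function name `classification_metrics` and the counts are hypothetical.

```python
import math
from typing import Tuple


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (ACC, PPV, MCC) from confusion-matrix counts.

    ACC is plain accuracy here (an assumption; see the text above).
    Undefined ratios are reported as 0.0.
    """
    total = tp + tn + fp + fn
    acc = (tp + tn) / total if total else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, ppv, mcc


# Hypothetical counts for disordered (positive) vs ordered (negative) residues.
acc, ppv, mcc = classification_metrics(tp=40, tn=45, fp=5, fn=10)
print(f"ACC={acc:.3f} PPV={ppv:.3f} MCC={mcc:.3f}")
```

MCC is often preferred when ordered and disordered residues are imbalanced, because it uses all four cells of the confusion matrix.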

FigShare